SARS-CoV-2 booster vaccine dose significantly extends humoral immune response half-life beyond the primary series

SARS-CoV-2 lipid nanoparticle mRNA vaccines continue to be administered as the predominant prophylactic measure to reduce COVID-19 disease pathogenesis. Quantifying the kinetics of the secondary immune response to doses beyond the primary series, and understanding how dose-dependent immune waning kinetics vary with age, sex, and various comorbidities, remain important questions. We study anti-spike IgG (Anti-S) waning kinetics in 152 individuals who received an mRNA-based primary series (first two doses) and a subset of 137 individuals who then received an mRNA-based booster dose. We find that the booster dose elicits a 71–84% increase in the median Anti-S half-life over that of the primary series. We find that the Anti-S half-life for both primary series and booster doses decreases with age. However, we stress that although chronological age continues to be a good proxy for vaccine-induced humoral waning, immunosenescence is likely not the mechanism; rather, the mechanism is more likely related to the presence of noncommunicable diseases that affect immune regulation and also accumulate with age. We independently reproduce recent observations that individuals with pre-existing asthma exhibit a stronger primary series humoral response to vaccination than those without, and we further find that this result is sustained for the booster dose.
Finally, via a single-variate Kruskal-Wallis test we find no difference between male and female humoral decay kinetics; however, a multivariate approach utilizing Least Absolute Shrinkage and Selection Operator (LASSO) regression for feature selection reveals a statistically significant ($p < 1\times 10^{-3}$), albeit small, bias in favour of longer-lasting humoral immunity amongst males.


S2 Observation, error models, and Lasso regression details
Eq. 1 is fit to individual Anti-S trajectories in Monolix. The observation model used in these fits is given by, where a and b are error model parameters, and e is a sequence of independent random variables normally distributed with mean 0 and variance 1. The individual model used to fit the decay rate, γ_{j,i}, is given by, where η_{γ_{j,i}} are the random effects for the decay rate of the jth dose for the ith individual.
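As a sketch only, the two models could take the following form, assuming Monolix's combined residual error model and a log-normal distribution for the individual decay rates. Both choices are assumptions, as are the symbols $y_{i,k}$ (the kth observation for individual i), $f$ (the prediction from Eq. 1), $t_{i,k}$, $\psi_i$, $\gamma_{j,\mathrm{pop}}$ and $\omega_{\gamma_j}$, none of which appear in the text above.

$$y_{i,k} = f(t_{i,k},\psi_i) + \left(a + b\,f(t_{i,k},\psi_i)\right)e_{i,k}, \qquad e_{i,k}\sim\mathcal{N}(0,1)$$

$$\gamma_{j,i} = \gamma_{j,\mathrm{pop}}\,\exp\left(\eta_{\gamma_{j,i}}\right), \qquad \eta_{\gamma_{j,i}}\sim\mathcal{N}\left(0,\omega_{\gamma_j}^{2}\right)$$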

Lasso Regression
Lasso (Least Absolute Shrinkage and Selection Operator) regression is a form of linear regression that introduces a penalty term on the absolute values of the regression coefficients β_j. Specifically, the L1 penalty term is given by:
$$\lambda \sum_{j} |\beta_j|$$
where λ is a non-negative regularization parameter, and β_j represents each coefficient in the linear model.
The L1 penalty encourages sparse solutions by setting some coefficients to zero, effectively performing feature selection.
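To illustrate how the L1 penalty sets coefficients exactly to zero, here is a minimal Python sketch of Lasso fitted by cyclic coordinate descent with soft-thresholding. It is not the implementation used for the analysis; the function names and the toy data are hypothetical, and the predictors are assumed to be centred so that no intercept is needed.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2N)) * sum((y - X b)^2) + lam * sum(|b_j|).

    X is a list of rows with centred predictors (no intercept column).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            # Closed-form update for one coordinate of the L1-penalised problem.
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Hypothetical toy data: y depends strongly on feature 0, weakly on feature 1.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.2, 1.8, -1.8, -2.2]

beta_unpenalised = lasso_coordinate_descent(X, y, lam=0.0)
beta_sparse = lasso_coordinate_descent(X, y, lam=0.5)
print(beta_unpenalised)
print(beta_sparse)
```

With no penalty both coefficients are retained; with the penalty switched on, the weak coefficient is set exactly to zero while the strong one is only shrunk, which is the feature-selection behaviour described above.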
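In R, the regression objective function is:
$$\min_{\beta_0,\beta}\ \frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{T}\beta\right)^{2}+\lambda\left[\frac{1-\alpha}{2}\sum_{j}\beta_j^{2}+\alpha\sum_{j}|\beta_j|\right]$$
- $\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{T}\beta\right)^{2}$ is the loss term, which measures how well the model fits the data.
- $\lambda\left[\frac{1-\alpha}{2}\sum_{j}\beta_j^{2}+\alpha\sum_{j}|\beta_j|\right]$ is the penalty term. λ is the regularization parameter, and α is the mixing parameter between L1 and L2 penalties.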
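The roles of λ and α can be illustrated with a short Python sketch that evaluates the loss and penalty terms separately. It assumes the elastic-net form of the objective documented for glmnet's Gaussian family, (1/(2N))·RSS + λ[(1−α)/2·Σβ² + α·Σ|β|]; the function name and the toy numbers are hypothetical.

```python
def elastic_net_objective(X, y, beta0, beta, lam, alpha):
    """Return (loss, penalty) for a linear model under an elastic-net penalty."""
    n = len(y)
    # Loss term: residual sum of squares scaled by 1/(2N).
    rss = 0.0
    for row, target in zip(X, y):
        prediction = beta0 + sum(x * b for x, b in zip(row, beta))
        rss += (target - prediction) ** 2
    loss = rss / (2 * n)
    # Penalty term: alpha mixes the L1 (Lasso) and L2 (Ridge) parts.
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    penalty = lam * ((1 - alpha) / 2 * l2 + alpha * l1)
    return loss, penalty


# Hypothetical toy inputs.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
beta = [1.0, -2.0]

loss, lasso_penalty = elastic_net_objective(X, y, 0.0, beta, lam=0.1, alpha=1.0)
_, ridge_penalty = elastic_net_objective(X, y, 0.0, beta, lam=0.1, alpha=0.0)
_, mixed_penalty = elastic_net_objective(X, y, 0.0, beta, lam=0.1, alpha=0.5)
print(loss, lasso_penalty, ridge_penalty, mixed_penalty)
```

The loss term does not depend on α; only the penalty changes as α moves between the Ridge (α = 0) and Lasso (α = 1) limits.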
For the 'glmnet' package, the parameter α controls the balance between Lasso (L1 penalty) and Ridge (L2 penalty). When α = 1, it is Lasso regression. When α = 0, it is Ridge regression. For 0 < α < 1, it is Elastic Net regression, which combines both L1 and L2 penalties. In our case α = 1, so we are using Lasso regression [1]. For λ selection, the dataset is divided into 'k' subsets, and the Lasso model is trained on 'k-1' of these subsets and validated on the remaining one, cycling through all 'k' subsets and averaging the validation errors to select the best λ.

The VIF analysis (Figure S8) suggests that there are no substantial multicollinearity concerns within these models, with the exception of the 'ageMinMaxNormalized' variable. This variable exhibited a marginally high GVIF value (2.20 for 'dose2Mu' and 2.16 for 'dose3Mu'), just slightly above the conventional threshold of 2 [5][6][7]. This prompts a discussion on whether to include or exclude this variable from the model. The slight exceedance of the threshold does not necessarily denote a substantial violation of the assumptions underlying the model. The variable 'ageMinMaxNormalized' is the focus of our analysis, and excluding it based solely on a marginally high VIF value could lead to an incomplete or biased understanding of the underlying relationships. For this reason we include 'ageMinMaxNormalized' in the analysis.

In the comorbidity heatmap (Figure S9), each cell represents the proportion of individuals having a specific condition (represented by columns) given the presence of another condition (indicated by rows). The color gradient corresponds with the percentage: deeper red shades signify a higher percentage, while white denotes a lower one. Each value is the conditional probability, in percentage terms, that an individual possesses the comorbidity listed in the column, given that they have the one in the row. For example, if someone has Hypertension (row), the probability they also have Hypertension (column) is 100%. However, the probability that they have Diabetes (column) drops to 17%. The repetition of percentages across different conditions can be attributed to the limited count of individuals with certain diseases in the dataset.
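The conditional percentages behind such a heatmap can be computed as in the following Python sketch. It follows the convention of the Hypertension example (condition on the row, report the column); the function name and the six toy records are hypothetical and are not the study data.

```python
def conditional_percentages(records, conditions):
    """table[row][col] = % of individuals with `row` who also have `col`."""
    table = {}
    for row in conditions:
        with_row = [r for r in records if row in r]
        table[row] = {}
        for col in conditions:
            if not with_row:
                # No individual has the row condition: percentage is undefined.
                table[row][col] = None
            else:
                both = sum(1 for r in with_row if col in r)
                table[row][col] = 100.0 * both / len(with_row)
    return table


# Hypothetical records: each set lists one individual's diagnoses.
records = [
    {"Hypertension", "Diabetes"},
    {"Hypertension"},
    {"Hypertension"},
    {"Hypertension", "Asthma"},
    {"Diabetes"},
    {"Asthma"},
]
conditions = ["Hypertension", "Diabetes", "Asthma"]

table = conditional_percentages(records, conditions)
for row in conditions:
    print(row, [table[row][col] for col in conditions])
```

The table is not symmetric: in this toy data 25% of individuals with Hypertension also have Diabetes, whereas 50% of those with Diabetes also have Hypertension. With few individuals per condition, only a handful of distinct percentages are possible, which is why values repeat.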

S5 Multivariate statistical analysis
S1 Comorbidity breakdown
S2 Observation, error models, and Lasso regression details
S3 Population and individual model fits
S4 Exploring the age cutoff for statistical comparisons
S5 Multivariate statistical analysis

Figure S1: Distribution of chronic conditions among individuals. The bars represent the percentage of diagnoses. The legend details the count of each condition.

Figure S3: Example individual fits from the primary series and booster dose Anti-S trajectory data. Vertical dashed lines from left to right indicate the day of SARS-CoV-2 doses one, two, and three, respectively.

Figure S4: Population fits to all primary series and booster dose data.

Figure S5: Bonferroni-corrected P-values as a function of young/old age cutoff.

Figure S6: Relationship between the logarithm of nucleocapsid (anti-N) antibody levels and spike protein (anti-S) antibody levels, with data points colored by COVID-19 test status. Vertical dashed lines represent thresholds used to infer assumed positive status before booster administration: low (dark red), alternative (light red), and high (green). These thresholds are determined as follows. Low threshold (dark red): twice the lowest anti-N antibody level observed among those who tested positive for COVID-19. Alternative threshold (light red, the middle vertical dashed line): an arbitrary fixed value capturing the visual nadir in the distribution of anti-N results. High threshold (green): the highest anti-N antibody level observed among those who have not tested positive for COVID-19. Panels A and B show pre- and post-booster dose data points, respectively.

Figure S7: Correlation heatmap illustrating relationships between various chronic comorbidities, demographic factors, and vaccine dose responses available in the data. Red indicates positive correlation, blue indicates negative correlation, and white indicates no correlation.

Figure S8: Variance Inflation Factor (VIF) analysis. The figure displays GVIF^(1/(2·Df)) values for each variable considered in the regression models. All variables, except for 'ageMinMaxNormalized', exhibit values below the threshold of 2. This suggests that there are no substantial multicollinearity concerns within these models, with the exception of the 'ageMinMaxNormalized' variable.

Figure S9: The heatmap portrays the percentage of individuals in our dataset with co-existing chronic diseases.

Table S1: Correlation relationships between various chronic comorbidities and demographic factors.

Table S2: Multivariate linear regression coefficients and significance.